Skip to content

fix(mcp): get_chart_sql drops x_axis on echarts_timeseries_* and only renders one query for mixed_timeseries#39865

Open
aminghadersohi wants to merge 5 commits intoapache:masterfrom
aminghadersohi:aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries
Open

fix(mcp): get_chart_sql drops x_axis on echarts_timeseries_* and only renders one query for mixed_timeseries#39865
aminghadersohi wants to merge 5 commits intoapache:masterfrom
aminghadersohi:aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries

Conversation

@aminghadersohi
Copy link
Copy Markdown
Contributor

Summary

Fixes two bugs in get_chart_sql introduced after #38700:

Bug 1 — x_axis dropped for echarts_timeseries_*:
For XY/timeseries charts the temporal column is stored in form_data["x_axis"], not in groupby. When _build_query_context_from_form_data built the query dict it only used groupby, so the time axis column was silently missing from the generated SQL.

Bug 2 — mixed_timeseries renders only one query:
mixed_timeseries has two independent series layers (primary: metrics/groupby, secondary: metrics_b/groupby_b). Previously only the primary query was built and passed to QueryContextFactory, so get_chart_sql returned SQL for only half the chart.

Changes

  • _extract_x_axis_col: new helper that normalises x_axis (plain string or adhoc column dict) to a bare column name.
  • _build_single_query_dict: extracted from an inner function so complexity stays within the C901 limit.
  • _build_mixed_timeseries_secondary: new helper that builds the secondary query dict from metrics_b / groupby_b.
  • _build_query_context_from_form_data: detects echarts_timeseries_* and mixed_timeseries viz types; prepends x_axis to columns; passes queries=[primary, secondary] for mixed_timeseries.
  • New TestExtractXAxisCol and TestBuildQueryContextTimeseriesAndMixed test classes.

Testing

pytest tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py -v

… renders one query for mixed_timeseries

- Add `_extract_x_axis_col` helper to normalise the x_axis field (plain
  string or adhoc column dict) into a bare column name.

- In `_build_query_context_from_form_data`, detect echarts_timeseries_*
  and mixed_timeseries viz types and prepend the x_axis column to the
  query's columns list so the temporal dimension appears in the generated
  SQL (was previously dropped because x_axis is stored separately from
  groupby in form_data).

- For mixed_timeseries, build a second query dict from metrics_b /
  groupby_b and pass queries=[primary, secondary] to QueryContextFactory.
  Previously only the primary query was built, so the secondary series SQL
  was silently lost.

- Add TestExtractXAxisCol and TestBuildQueryContextTimeseriesAndMixed test
  classes covering both bug fixes.
@codecov
Copy link
Copy Markdown

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 6.66667% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.86%. Comparing base (dc1c0f6) to head (8b31f21).
⚠️ Report is 48 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/chart/tool/get_chart_sql.py 6.66% 42 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #39865      +/-   ##
==========================================
- Coverage   64.35%   63.86%   -0.50%     
==========================================
  Files        2569     2583      +14     
  Lines      134680   136666    +1986     
  Branches    31254    31502     +248     
==========================================
+ Hits        86679    87287     +608     
- Misses      46505    47863    +1358     
- Partials     1496     1516      +20     
Flag Coverage Δ
hive 39.35% <6.66%> (?)
mysql 59.00% <6.66%> (-0.94%) ⬇️
postgres 59.08% <6.66%> (-0.94%) ⬇️
presto 41.05% <6.66%> (-0.38%) ⬇️
python 60.52% <6.66%> (-0.99%) ⬇️
sqlite 58.72% <6.66%> (-0.93%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes regressions in the MCP get_chart_sql tool’s query-context construction so that SQL rendering matches how Superset Explore builds queries for ECharts timeseries and mixed-timeseries charts.

Changes:

  • Add x_axis extraction and prepend it to query columns for echarts_timeseries_* and mixed_timeseries so the temporal axis is not dropped from rendered SQL.
  • Build and pass two query dicts for mixed_timeseries (primary + secondary) so both series layers’ SQL is returned.
  • Improve chart schema validation ergonomics (relax column-name regex constraints, add richer validation error formatting) and expand unit test coverage around these behaviors.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
superset/mcp_service/chart/tool/get_chart_sql.py Adds helpers to include x_axis in query columns and emits 2 queries for mixed_timeseries when building QueryContextFactory input.
superset/mcp_service/chart/validation/schema_validator.py Formats Pydantic validation errors into more actionable per-field details and deduped suggestions.
superset/mcp_service/chart/schemas.py Relaxes strict regex constraints on column-like fields (e.g., digit-prefixed / locale chars).
superset/mcp_service/chart/tool/generate_chart.py Expands tool docstring with additional request examples for several chart types.
tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py Adds unit tests for _extract_x_axis_col and for correct query building for ECharts timeseries + mixed timeseries.
tests/unit_tests/mcp_service/chart/test_chart_schemas.py Adds tests verifying relaxed column-name acceptance and continued blocking of obvious XSS/SQL-injection attempts (where applicable).
Comments suppressed due to low confidence (1)

superset/mcp_service/chart/schemas.py:785

  • FilterConfig.column removed the regex constraint, but sanitize_column still calls sanitize_user_input() without check_sql_keywords=True. As a result, values containing SQL metacharacters/comments (e.g. ';', '--', '/*') and dangerous SQL keywords are no longer rejected at validation time, which weakens the injection protections compared to the previous regex. Consider enabling check_sql_keywords=True (similar to ColumnRef.sanitize_name) so relaxed column naming stays compatible with locale chars while still blocking SQL injection patterns.
    column: str = Field(
        ...,
        min_length=1,
        max_length=255,
        # No regex pattern: sanitize_name() already blocks XSS/SQL injection;
        # many valid column names (digit-prefixed, locale chars, etc.) would
        # be rejected by a strict pattern while posing no security risk.
        # Use get_dataset_info to find exact column names.
        validation_alias=AliasChoices("column", "col"),
    )
    op: Literal[
        "=",
        ">",
        "<",
        ">=",
        "<=",
        "!=",
        "LIKE",
        "ILIKE",
        "NOT LIKE",
        "IN",
        "NOT IN",
    ] = Field(
        ...,
        description="LIKE/ILIKE use % wildcards. IN/NOT IN take a list.",
        validation_alias=AliasChoices("op", "operator", "opr"),
    )
    value: str | int | float | bool | list[str | int | float | bool] = Field(
        ...,
        description="For IN/NOT IN, provide a list.",
        validation_alias=AliasChoices("value", "val"),
    )

    @field_validator("column")
    @classmethod
    def sanitize_column(cls, v: str) -> str:
        """Sanitize filter column name to prevent injection attacks."""
        # sanitize_user_input raises ValueError when allow_empty=False (default)
        # so the return value is guaranteed to be a non-None str
        return sanitize_user_input(v, "Filter column", max_length=255)  # type: ignore[return-value]

Comment thread superset/mcp_service/chart/tool/get_chart_sql.py
Comment thread superset/mcp_service/chart/schemas.py
Comment thread tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py Outdated
@aminghadersohi aminghadersohi force-pushed the aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries branch from afddf60 to 213fffd Compare May 4, 2026 19:30
@pull-request-size pull-request-size Bot added size/L and removed size/XL labels May 4, 2026
…s_secondary to reduce complexity

Move the nested _make_query_dict helper and the inline mixed_timeseries
secondary query logic out of _build_query_context_from_form_data into
module-level functions.  This brings the McCabe complexity of
_build_query_context_from_form_data back within the C901 limit of 10.
@aminghadersohi aminghadersohi force-pushed the aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries branch from 213fffd to d099748 Compare May 6, 2026 17:01
@netlify
Copy link
Copy Markdown

netlify Bot commented May 6, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit d099748
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69fb73df20b8da0008f45461
😎 Deploy Preview https://deploy-preview-39865--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

@aminghadersohi aminghadersohi marked this pull request as ready for review May 6, 2026 19:01
@dosubot dosubot Bot added viz:charts:mixed viz:charts:timeseries Related to Timeseries labels May 6, 2026
…ghten types

- _build_mixed_timeseries_secondary: apply time_range_b from form_data to
  the secondary query dict when present, so charts with an independent
  secondary time range produce correct SQL via the fallback path
- Change groupby_b annotation from list[str] to list[Any] to reflect that
  _build_single_query_dict columns accept Any items (e.g. adhoc dicts)
- Add test: secondary time_range overridden when time_range_b is set
- Add test: SQL-expression x_axis (no column_name) returns None
- Fix test class docstring: remove stale PR reference
Comment on lines +191 to +192
if (row_limit := form_data.get("row_limit")) is not None:
qd["row_limit"] = row_limit
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Query dict construction always reads row_limit and never supports row_limit_b, so the secondary mixed_timeseries query cannot honor its own row limit. This causes incorrect SQL LIMIT behavior for query B. Use the secondary-specific limit when building the secondary query. [logic error]

Severity Level: Major ⚠️
- ❌ Secondary series row limits ignored in get_chart_sql output.
- ⚠️ SQL preview shows incorrect LIMIT for mixed_timeseries B.
Steps of Reproduction ✅
1. Configure a `mixed_timeseries` chart whose Explore form_data contains both `row_limit`
and a different `row_limit_b` (per-series secondary row limit). Example persisted
configuration appears in `superset/examples/featured_charts/charts/Mixed.yaml:80-81` where
both `row_limit` and `row_limit_b` are defined.

2. Use the MCP `get_chart_sql` tool for this chart in a context that goes through the
form_data fallback, e.g. request SQL for an unsaved Explore state via `form_data_key` so
`_handle_unsaved_chart_sql` (`superset/mcp_service/chart/tool/get_chart_sql.py:24-56`)
loads the cached form_data and calls `_sql_from_form_data(form_data, chart=None)` at lines
39-56.

3. `_sql_from_form_data` invokes `_build_query_context_from_form_data` (lines 218-305),
which builds the primary query via `_build_single_query_dict(form_data, groupby, metrics)`
and the secondary mixed_timeseries query via `_build_mixed_timeseries_secondary` (lines
196-215). Inside `_build_mixed_timeseries_secondary`, the secondary query dict is
constructed with `qd = _build_single_query_dict(form_data, groupby_b, metrics_b)` at line
212.

4. `_build_single_query_dict` (lines 180-193) always reads `row_limit` from
`form_data.get("row_limit")` and, if set, assigns `qd["row_limit"] = row_limit` (lines
191-192). There is no handling of `row_limit_b`, so both the primary query and secondary
query use the same `row_limit` value. When `ChartDataCommand` executes this `QueryContext`
(`_sql_from_form_data` lines 44-49), the generated SQL for series B uses the primary row
limit instead of the configured `row_limit_b`, making the SQL preview inconsistent with
the chart configuration.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/tool/get_chart_sql.py
**Line:** 191:192
**Comment:**
	*Logic Error: Query dict construction always reads `row_limit` and never supports `row_limit_b`, so the secondary `mixed_timeseries` query cannot honor its own row limit. This causes incorrect SQL LIMIT behavior for query B. Use the secondary-specific limit when building the secondary query.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

groupby_b: list[Any] = [raw_b] if isinstance(raw_b, str) else list(raw_b)
if x_axis_col and x_axis_col not in groupby_b:
groupby_b = [x_axis_col] + groupby_b
qd = _build_single_query_dict(form_data, groupby_b, metrics_b)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The secondary query for mixed_timeseries is built from the original form_data without translating _b controls to base keys, so secondary-only filters (for example adhoc_filters_b after preprocessing) are never applied. This makes query B SQL use primary filter state and can return incorrect results. Build a secondary form-data view (equivalent to retainFormDataSuffix(..., "_b")) before constructing query B. [logic error]

Severity Level: Major ⚠️
- ❌ get_chart_sql misrepresents mixed_timeseries secondary series filters.
- ⚠️ Unsaved mixed_timeseries SQL omits secondary adhoc_filters_b.
Steps of Reproduction ✅
1. Create or edit a `mixed_timeseries` chart in Explore so its form_data (as stored in
`Slice.params` / cached via `GetFormDataCommand`) contains both primary and secondary
series, including `metrics`, `groupby`, `metrics_b`, `groupby_b`, and a secondary-only
adhoc filter in `adhoc_filters_b`. This matches how mixed charts are migrated/built in
`superset/migrations/shared/migrate_viz/processors.py:17-28` where `adhoc_filters_b` is
populated from `adhoc_filters`.

2. From an MCP client, request SQL for this chart using only `form_data_key` (unsaved
state), triggering `_handle_unsaved_chart_sql` at
`superset/mcp_service/chart/tool/get_chart_sql.py:24-56`, which loads the cached form_data
and calls `_sql_from_form_data(form_data, chart=None)` at lines 39-56.

3. `_sql_from_form_data` calls `_build_query_context_from_form_data` (lines 218-305).
Inside that function, adhoc filters are preprocessed by `merge_extra_filters` and
`split_adhoc_filters_into_base_filters` at lines 250-263, but
`split_adhoc_filters_into_base_filters` (`superset/utils/core.py:1387-1400`) only reads
the `adhoc_filters` key, never `adhoc_filters_b`, so secondary-only adhoc filters remain
unprocessed on `form_data` and are not converted into `filters`/`where`/`having`.

4. Still in `_build_query_context_from_form_data`, the secondary query dict for
mixed_timeseries is built by `_build_mixed_timeseries_secondary` (lines 196-215), which
calls `qd = _build_single_query_dict(form_data, groupby_b, metrics_b)` at line 212.
`_build_single_query_dict` (lines 180-193) pulls `filters` and `time_range` from the
shared `form_data`, so the secondary query's `qd["filters"]` is based only on primary
`adhoc_filters` that were split into base filters, while `adhoc_filters_b` is never
applied. When `ChartDataCommand` later runs on this `QueryContext` (`_sql_from_form_data`
lines 44-49), the rendered SQL for query B ignores its intended secondary-only filters.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/tool/get_chart_sql.py
**Line:** 212:212
**Comment:**
	*Logic Error: The secondary query for `mixed_timeseries` is built from the original `form_data` without translating `_b` controls to base keys, so secondary-only filters (for example `adhoc_filters_b` after preprocessing) are never applied. This makes query B SQL use primary filter state and can return incorrect results. Build a secondary form-data view (equivalent to `retainFormDataSuffix(..., "_b")`) before constructing query B.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented May 7, 2026

Code Review Agent Run #a593d2

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: d2add99..1507685
    • superset/mcp_service/chart/tool/get_chart_sql.py
    • tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

…econdary query

The secondary series in mixed_timeseries can have independent row limit
and adhoc filters configured separately from the primary series.

- _build_mixed_timeseries_secondary now accepts engine and uses it to
  process adhoc_filters_b via split_adhoc_filters_into_base_filters,
  replacing the inherited primary filters on the secondary query dict
- row_limit_b overrides row_limit on the secondary query dict when set
- Tests: row_limit_b, adhoc_filters_b each exercised independently
@pull-request-size pull-request-size Bot added size/XL and removed size/L labels May 7, 2026
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented May 8, 2026

Code Review Agent Run #58d06c

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 1507685..8b31f21
    • superset/mcp_service/chart/tool/get_chart_sql.py
    • tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Copy link
Copy Markdown
Member

@betodealmeida betodealmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants